
    On visually-grounded reference production: testing the effects of perceptual grouping and 2D/3D presentation mode

    When referring to a target object in a visual scene, speakers are assumed to consider certain distractor objects to be more relevant than others. The current research predicts that the way in which speakers arrive at a set of relevant distractors depends on how they perceive the distance between the objects in the scene. It reports the results of two language production experiments, in which participants referred to target objects in photo-realistic visual scenes. Experiment 1 manipulated three factors that were expected to affect perceived distractor distance: two manipulations of perceptual grouping (region of space and type similarity), and one of presentation mode (2D vs. 3D). In line with most previous research on visually-grounded reference production, an offline measure of visual attention was taken here: the occurrence of overspecification with color. The results showed effects of region of space and type similarity on overspecification, suggesting that distractors perceived as being in the same group as the target are more often considered relevant than distractors in a different group. Experiment 2 verified this suggestion with a direct measure of visual attention, eye tracking, and added a third manipulation of grouping: color similarity. For region of space in particular, the eye-movement data indeed showed patterns in the expected direction: distractors within the same region as the target were fixated more often, and longer, than distractors in a different region. Color similarity was found to affect overspecification with color, but not gaze duration or the number of distractor fixations. The expected effects of presentation mode (2D vs. 3D) were also not convincingly borne out by the data. Taken together, these results provide direct evidence for the close link between scene perception and language production, and indicate that perceptual grouping principles can guide speakers in determining the distractor set during reference production.

    Does Size Matter – How Much Data is Required to Train a REG Algorithm?

    In this paper we investigate how much data is required to train an algorithm for attribute selection, a subtask of Referring Expression Generation (REG). To enable comparison between different-sized training sets, a systematic training method was developed. The results show that, depending on the complexity of the domain, training on 10 to 20 items may already lead to good performance.
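    To make concrete what the attribute-selection subtask computes, the sketch below gives a deliberately simplified, dictionary-based version in the spirit of the classic Incremental Algorithm (Dale & Reiter, 1995): it picks attribute-value pairs that rule out the distractors. The preference order, the domain representation, and the example objects are hypothetical illustrations, not the trained algorithm evaluated in the paper.

```python
# Illustrative sketch only: a simplified incremental attribute-selection routine.
# The paper's own algorithm is trained on corpus data; this version merely shows
# the input/output of the subtask: given a target and its distractors, return
# attribute-value pairs that uniquely identify the target.

PREFERENCE_ORDER = ["type", "color", "size"]  # assumed, domain-dependent ordering

def select_attributes(target, distractors, preference=PREFERENCE_ORDER):
    """Return attribute-value pairs that rule out all distractors."""
    selected = {}
    remaining = list(distractors)
    for attr in preference:
        value = target.get(attr)
        if value is None:
            continue
        # Distractors that do not share this value are ruled out by mentioning it.
        still_confusable = [d for d in remaining if d.get(attr) == value]
        if len(still_confusable) < len(remaining):   # attribute has discriminatory power
            selected[attr] = value
            remaining = still_confusable
        if not remaining:                            # description is now distinguishing
            break
    return selected

# Hypothetical domain: "the red chair" suffices to single out the target.
target = {"type": "chair", "color": "red", "size": "large"}
distractors = [{"type": "table", "color": "red", "size": "large"},
               {"type": "chair", "color": "blue", "size": "small"}]
print(select_attributes(target, distractors))  # -> {'type': 'chair', 'color': 'red'}
```

    In data-driven variants of this task, training typically amounts to estimating such a preference order (or attribute costs) from human-produced descriptions, which is why the size of the training set matters.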

    Eye movements during reference production: Testing the effects of perceptual grouping on referential overspecification

    When referring to a target object in a visual scene, speakers are assumed to consider certain visible distractor objects to be more relevant than others. However, previous research testing this assumption has mainly applied offline measures of visual attention, such as the occurrence of overspecification in speakers’ target descriptions. Therefore, in the current study, we take both online (eye tracking) and offline (overspecification) measures of attention to study how perceptual grouping affects scene perception and reference production. We manipulated three grouping principles: region of space, type similarity, and color similarity. For all three factors, we found effects, either on eye movements (region of space), on overspecification (color similarity), or on both (type similarity). The results for type similarity provide direct evidence for the close link between scene perception and reference production.

    Effects of domain size during reference production in photo-realistic scenes

    The current study investigates how speakers are affected by the size of the visual domain during reference production. Previous research found that speech onset times increase with the number of visible distractors, at least when speakers refer to non-salient target objects in simplified visual domains. This suggests that with more distractors, speakers need more time to perform an object-by-object scan of all visible distractors. We present the results of a reference production experiment to study whether this pattern of speech onset times holds for photo-realistic scenes, and to test whether the suggested viewing strategy is reflected directly in speakers’ eye movements. Our results show that this is indeed the case: we find (1) that speech onset times increase linearly as more distractors are present; (2) that speakers fixate the target relatively less often in larger domains; and (3) that larger domains elicit more fixation switches back and forth between the target and its distractors.

    Developmental Changes in Children’s Processing of Redundant Modifiers in Definite Object Descriptions

    This paper investigates developmental changes in children’s processing of redundant information in definite object descriptions. In two experiments, children of two age groups (6 or 7, and 9 or 10 years old) were presented with pictures of sweets. In the first experiment (pairwise comparison), two identical sweets were shown, and one of these was described with a redundant modifier. After the description, the children had to indicate in a forced-choice task which sweet they preferred. In the second experiment (graded rating), only one sweet was shown, which was described with a redundant color modifier in half of the cases (e.g., “the blue sweet”) and in the other half of the cases simply as “the sweet.” This time, the children were asked to indicate on a 5-point rating scale to what extent they liked the sweets. In both experiments, the results showed that the younger children had a preference for sweets described with redundant information, whereas redundant information had no effect on the older children’s preferences. These results imply that children are learning to distinguish between situations in which redundant information carries an implicature and situations in which it does not.

    Stored object knowledge and the production of referring expressions: The case of color typicality

    When speakers describe objects with atypical properties, do they include these properties in their referring expressions, even when that is not strictly required for unique referent identification? Based on previous work, we predict that speakers mention the color of a target object more often when the object is atypically colored than when it is typically colored. Taking literature on object recognition and visual attention into account, we further hypothesize that this behavior is proportional to the degree to which a color is atypical, and to whether color is a highly diagnostic feature of the referred-to object's identity. We investigate these expectations in two language production experiments, in which participants referred to target objects in visual contexts. In Experiment 1, we find a strong effect of color typicality: less typical colors for target objects predict higher proportions of referring expressions that include color. In Experiment 2, we used objects with more complex shapes, for which color is less diagnostic, and we find that the color typicality effect is moderated by color diagnosticity: it is strongest for high-color-diagnostic objects (i.e., objects with a simple shape). These results suggest that the production of atypical color attributes results from a contrast with stored knowledge, an effect which is stronger when color is more central to object identification. Our findings offer evidence for models of reference production that incorporate general object knowledge, enabling them to capture these effects of typicality on the content of referring expressions.

    Cross-linguistic Attribute Selection for REG: Comparing Dutch and English

    In this paper we describe a cross-linguistic experiment in attribute selection for referring expression generation. We used a graph-based attribute selection algorithm that was trained and cross-evaluated on English and Dutch data. The results indicate that attribute selection can be done in a largely language-independent way.
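    As a rough illustration of how such an algorithm can be made largely language-independent, the sketch below frames content selection as a search for the cheapest distinguishing property set, with per-language attribute costs estimated from a training corpus. The cost tables, domain representation, and example are invented for this sketch; the actual graph-based algorithm operates on labelled scene graphs and learned cost functions.

```python
# Hypothetical sketch of cost-based content selection: the same search procedure
# is reused across languages, and only the corpus-estimated attribute costs
# differ. All numbers and objects below are invented for illustration.
from itertools import combinations

ASSUMED_COSTS = {                       # assumed corpus-derived costs per language
    "en": {"type": 0.0, "color": 0.5, "size": 1.0},
    "nl": {"type": 0.0, "color": 0.4, "size": 1.2},
}

def distinguishes(attrs, target, distractors):
    """True if every distractor differs from the target on at least one chosen attribute."""
    return all(any(d.get(a) != target.get(a) for a in attrs) for d in distractors)

def cheapest_description(target, distractors, lang="en"):
    """Search for the lowest-cost attribute set that uniquely identifies the target."""
    costs = ASSUMED_COSTS[lang]
    candidates = [a for a in costs if a in target]
    best, best_cost = None, float("inf")
    for r in range(1, len(candidates) + 1):
        for attrs in combinations(candidates, r):
            if distinguishes(attrs, target, distractors):
                cost = sum(costs[a] for a in attrs)
                if cost < best_cost:
                    best, best_cost = attrs, cost
    return {a: target[a] for a in best} if best else None

# "The red chair": type and color together rule out both distractors.
target = {"type": "chair", "color": "red", "size": "large"}
distractors = [{"type": "table", "color": "red", "size": "large"},
               {"type": "chair", "color": "blue", "size": "large"}]
print(cheapest_description(target, distractors, lang="nl"))  # -> {'type': 'chair', 'color': 'red'}
```

    Under this framing, the cross-linguistic question reduces to whether the learned costs, and hence the selected attributes, transfer between Dutch and English, which is what cross-evaluation on the two datasets tests.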